[WIP] Standard baseline configs by jemrobinson · Pull Request #324 · alan-turing-institute/icenet-mp

jemrobinson · 2026-06-25T13:31:23Z

This PR adds some baseline configs for the model comparisons discussed in #323.

TL;DR

nothing is beating persistence on a two day forecast (perhaps we want to look at e.g. 7 days?)
some very weird aliasing features on piecewise and ViT, which suggest that we could better optimise these defaults
is combining train and validation on the metric plots actually useful, or is it unnecessary noise?

Models

00: naive-unet-naive

50 epochs
northern hemisphere: glorious-morning-882
southern hemisphere: deft-sponge-888

01: persistence

50 epochs
northern hemisphere: frosty-voice-895
southern hemisphere: sunny-bird-897

02: cnn-unet-cnn

50 epochs
northern hemisphere: zesty-cosmos-894
southern hemisphere: sleek-sea-896

03: cnn-vit-cnn

50 epochs
northern hemisphere: clear-violet-885
southern hemisphere: fragrant-paper-891

04: ddpm

34/35 epochs (hit 24h walltime limit)
northern hemisphere: solar-donkey-886
southern hemisphere: toasty-rain-893

05: piecewise-unet-piecewise-5336004.out

50 epochs
northern hemisphere: lemon-deluge-887
southern hemisphere: lyric-gorge-892

Metrics

RMSE

SIE Error

MAE
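For readers unfamiliar with the three metrics named above, here is a minimal sketch of one common way to compute them for a single forecast frame. It is not the icenet-mp implementation: the function names, the flat-list layout, the unit cell area and the conventional 15% concentration threshold for sea ice extent (SIE) are all assumptions made for illustration.

```python
import math


def rmse(pred, target):
    """Root-mean-square error over two equally sized flat sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def mae(pred, target):
    """Mean absolute error over two equally sized flat sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def sie_error(pred, target, threshold=0.15, cell_area=1.0):
    """Predicted minus true sea ice extent.

    Extent is taken as the total area of cells whose concentration is at
    or above `threshold` (assumed 15%); `cell_area` is an assumed constant.
    """
    extent_pred = sum(cell_area for p in pred if p >= threshold)
    extent_true = sum(cell_area for t in target if t >= threshold)
    return extent_pred - extent_true


# Toy sea ice concentrations for four grid cells (illustrative values only).
pred = [0.0, 0.2, 0.9, 0.1]
target = [0.0, 0.1, 1.0, 0.1]

print(rmse(pred, target), mae(pred, target), sie_error(pred, target))
```

RMSE and MAE score the concentration field cell by cell, whereas the SIE error only changes when a cell crosses the threshold, so the three can disagree about which model is better.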

IFenton · 2026-06-26T11:48:18Z

Couple of quick thoughts.

> nothing is beating persistence on a two day forecast (perhaps we want to look at e.g. 7 days?)

Looking at 7 days seems sensible anyway, as you can then get an idea of how the forecast accuracy develops over time (and, as you say, a model is more likely to beat persistence at that range).

As we've discussed elsewhere (#323), it seems sensible to do some work on optimising the models before we do this. For example, Erin's trained ViT model (https://wandb.ai/turing-seaice/evaluate/runs/0r9jomqk) does beat persistence.
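To make the point about lead time concrete, here is a small sketch of how RMSE could be broken down per forecast day and compared against persistence. Everything in it is hypothetical: the data are synthetic (a steadily retreating ice edge and a "model" with a constant bias), the function names are invented for the example, and none of the numbers come from the runs in this PR.

```python
import math


def rmse(pred, target):
    """Root-mean-square error over two equally sized flat sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def rmse_by_lead_time(forecast, truth):
    """One RMSE value per forecast step; each step is a flat list of cells."""
    return [rmse(f, t) for f, t in zip(forecast, truth)]


def persistence_forecast(last_observed, n_forecast_steps):
    """Repeat the most recent observed frame for every forecast step."""
    return [list(last_observed) for _ in range(n_forecast_steps)]


# Synthetic truth: three cells losing 0.05 concentration per day for 7 days.
last_observed = [1.0, 0.8, 0.6]
truth = [[c - 0.05 * day for c in last_observed] for day in range(1, 8)]

# Hypothetical model that tracks the trend but carries a constant -0.12 bias.
model = [[c - 0.12 for c in frame] for frame in truth]

persistence_rmse = rmse_by_lead_time(persistence_forecast(last_observed, 7), truth)
model_rmse = rmse_by_lead_time(model, truth)

# True on the days where the model has lower RMSE than persistence.
beats_persistence = [m < p for m, p in zip(model_rmse, persistence_rmse)]
print(beats_persistence)
```

In this toy setup the persistence error grows with lead time while the biased model's error stays flat, so the model only wins from day 3 onwards. A two-day evaluation would miss that crossover entirely.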

jemrobinson · 2026-06-26T12:12:39Z

For reference, @erinuclkwon's model was using

n_forecast_steps: 7
n_history_steps: 3

which is what the next round of testing will use.
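As a rough illustration of what these two settings mean, the sketch below slices a time-ordered series into samples of 3 history steps followed by 7 forecast steps, and builds a persistence forecast by repeating the most recent history step. The helper names `make_windows` and `persistence` are hypothetical and the integers stand in for daily frames; the actual icenet-mp data loading is not shown in this PR and may differ.

```python
def make_windows(series, n_history_steps=3, n_forecast_steps=7):
    """Split a time-ordered list of frames into (history, target) pairs."""
    window = n_history_steps + n_forecast_steps
    samples = []
    for start in range(len(series) - window + 1):
        history = series[start : start + n_history_steps]
        target = series[start + n_history_steps : start + window]
        samples.append((history, target))
    return samples


def persistence(history, n_forecast_steps=7):
    """Repeat the most recent history frame for every forecast step."""
    return [history[-1]] * n_forecast_steps


days = list(range(12))  # stand-in for 12 consecutive daily frames
samples = make_windows(days)
history, target = samples[0]

print(len(samples))          # number of (history, target) pairs
print(history, target)       # first sample
print(persistence(history))  # persistence forecast for the first sample
```

Each sample needs `n_history_steps + n_forecast_steps` consecutive frames, so moving from a 2-day to a 7-day forecast reduces the number of samples that can be cut from a fixed-length record.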

github-actions · 2026-06-26T15:48:10Z

Coverage report

File	Lines missing
icenet_mp/model_service.py	31-33, 43-45, 213-221
icenet_mp/callbacks/ema_weight_averaging_callback.py	29-30, 36-37, 41-42
icenet_mp/callbacks/metric_summary_callback.py	29, 62
icenet_mp/compatibility/__init__.py	-
icenet_mp/data_processors/data_downloader.py	190-197, 202-212, 231-235, 245-249
icenet_mp/models/base_model.py	95-96
icenet_mp/models/persistence.py	-
icenet_mp/models/common/conv_block_downsample.py	-
icenet_mp/models/common/conv_block_upsample.py	-
icenet_mp/models/decoders/cnn_decoder.py	-
icenet_mp/models/encoders/cnn_encoder.py	-
icenet_mp/models/processors/vit.py	41-42
icenet_mp/types/__init__.py	-
icenet_mp/types/simple_datatypes.py	-

This report was generated by python-coverage-comment-action

jemrobinson added 15 commits June 18, 2026 08:06

🐛 Check for appropriate patch size during VitProcessor initialisation

7c8ef0a

🚚 Rename baseline configs

4a3ff5e

⚰️ Remove non-baseline configs

df16775

📝 Better mask creation output

0a6d457

👽 Add missing cleanup step in dataset downloading

a095677

👽 Catch recipe failures during artifact cleanup

dda6dbf

📝 Reduce unnecessary logging

511a23d

⚗️ Check for artifacts as part of finalisation

c547ba3

🐛 Allow EMAWeightAveragingCallback to support parameter-less models like Persistence

5275b0d

✨ Make scale factor configurable in CNNEncoder/CNNDecoder

7746af0

🔧 Fix CNN-UNet-CNN config file

f69edc4

🐛 Do not log metrics before they have been updated

95c58cd

💡 Drop configuration message to debug as this appears on each worker

cb22696

🐛 Use most recent timestep for persistence

1921ddb

♻️ Rename _loss_fn to loss_fn

bb39cf5

jemrobinson changed the title ~~Standard baseline configs~~ [WIP] Standard baseline configs Jun 25, 2026

jemrobinson marked this pull request as draft June 25, 2026 13:39

jemrobinson mentioned this pull request Jun 25, 2026

What is the process for doing a new release? #323

Open

jemrobinson added 4 commits June 25, 2026 15:08

🔧 Increase naive-unet-naive start_out_channels and bound the output

492318d

🔧 Increase kernel_size, start_out_channels and added bounding to cnn_unet_cnn

230c6cd

🔧 Increase ViT size and add decoder bounding to cnn_vit_cnn

a5e96bb

🔧 Increase DDPM size and reduce dropout in ddpm

9a313c9

🎨 Tidy up use of fully_deterministic

b545ae1

jemrobinson force-pushed the add-baseline-configs branch from 57f3cfb to b545ae1 Compare June 26, 2026 15:26

✅ Fix residual use of _loss_fn in testing

5f85626

jemrobinson added 2 commits June 27, 2026 20:41

🔧 Increase complexity of ViT processor

467a674

🔧 Minor tweaks to cnn-unet-cnn, ddpm and piecewise-unet-piecewise configs

4dbf8d7

jemrobinson force-pushed the add-baseline-configs branch from 421c0b8 to 4dbf8d7 Compare June 29, 2026 15:27

jemrobinson wants to merge 23 commits into main from add-baseline-configs

2 participants
